fix: expand submit-mode blocking pattern detection #2804

Draft
yyq1043-cloud wants to merge 2 commits into soxoj:main from yyq1043-cloud:improve-blocking-detection

yyq1043-cloud commented Jun 27, 2026

Contributor

Summary

Expand the blocking detection logic in submit mode to catch more anti-bot patterns with clear, specific error messages.

Before: Only 3 patterns were detected (`/cdn-cgi/challenge-platform`, `\t\t\t\tnow:`, `Sorry, you have been blocked`), and the check always returned the generic "Cloudflare detected" message.

After: The check loops over a named `blocking_patterns` list covering 6 patterns (Cloudflare variants, generic blocks, access denied, CF headers) and returns a specific message that includes the HTTP status codes for debugging.

Changes

  • maigret/submit.py: Replace single-if blocking check with structured pattern loop

Example

# Before
Cloudflare detected, skipping   (no status codes)

# After
Cloudflare challenge detected (HTTP 403/403), skipping
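
A minimal sketch of the loop described above, for illustration only. It is not the code from `maigret/submit.py`: the function name `detect_blocking`, the constant `BLOCKING_PATTERNS`, the labels, and the `(first, second)` status-code pair are assumptions. Only the three pattern strings and the message format come from this description.

```python
from typing import List, Optional, Tuple

# (pattern, label) pairs. The pattern strings are the three listed under
# "Before"; the labels are hypothetical names chosen for this sketch.
BLOCKING_PATTERNS: List[Tuple[str, str]] = [
    ("/cdn-cgi/challenge-platform", "Cloudflare challenge"),
    ("\t\t\t\tnow: ", "Cloudflare challenge"),
    ("Sorry, you have been blocked", "Cloudflare block"),
]


def detect_blocking(body: str, status_codes: Tuple[int, int]) -> Optional[str]:
    """Return a skip message for the first pattern found in body, else None."""
    first, second = status_codes
    for pattern, label in BLOCKING_PATTERNS:
        if pattern in body:
            return f"{label} detected (HTTP {first}/{second}), skipping"
    return None


if __name__ == "__main__":
    html = '<script src="/cdn-cgi/challenge-platform/h/b/orchestrate.js"></script>'
    print(detect_blocking(html, (403, 403)))
```

Keeping the patterns in one list means a new marker is a one-line addition, and each entry can carry its own label instead of sharing a single generic message.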

@yyq1043-cloud

Contributor Author

Hi! I'd like to request review for this PR: Improve blocking pattern detection in submit mode
Requesting review from: @soxoj
Thank you! 🙂

soxoj commented Jun 29, 2026

Owner

Thanks for the PR! A few blockers before this can be merged:

  1. Scope mismatch. The title and #2668 (Complete --submit mode: urlProbe, activation, cookies, status_codes, update-existing) say "submit mode blocking detection", but 2 of 3 files (simple_report.tpl, simple_report_pdf.tpl) add web.archive.org / archive.is links to HTML/PDF reports, which is unrelated. Please split into two PRs.

  2. Wrong pattern. "CF-Chl-Alg-List:" is an HTTP header name, but first_html_response is the response body — headers won't appear there, so this check never fires. Also, "Now checking your browser" is labeled "Cloudflare turnstile", but that string is the old interstitial challenge page; Turnstile is a widget with different markers.

  3. Silent regression. The original "\t\t\t\tnow: " pattern is dropped without explanation — please keep it or justify removal.

  4. No tests for the new patterns.

  5. English-only markers. Many sites in maigret are non-English; consider whether matching only English strings is enough, or if these need locale-agnostic signals.
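
To illustrate points 2 and 5, here is a hedged sketch that checks header names and body markers separately. Everything in it is hypothetical: `classify_block`, `HEADER_MARKERS`, and `BODY_MARKERS` are not maigret APIs, and the header name is simply the one quoted in point 2, not verified against Cloudflare's documentation.

```python
from typing import Mapping, Optional

# Substrings looked up in the response body (English-only, see point 5).
BODY_MARKERS = ("/cdn-cgi/challenge-platform", "Sorry, you have been blocked")
# Header names looked up in the response headers, compared case-insensitively.
# This name is copied from the review comment; treat it as an assumption.
HEADER_MARKERS = ("cf-chl-alg-list",)


def classify_block(status: int, headers: Mapping[str, str], body: str) -> Optional[str]:
    """Report which signal matched, checking header names before the body."""
    header_names = {name.lower() for name in headers}
    for marker in HEADER_MARKERS:
        if marker in header_names:
            return f"blocking header {marker!r} present (HTTP {status})"
    for marker in BODY_MARKERS:
        if marker in body:
            return f"blocking marker {marker!r} in body (HTTP {status})"
    return None
```

Header names and status codes do not depend on the page language, so a check along these lines is one possible locale-agnostic signal. Because each branch is a pure function of its inputs, each pattern can be covered by a small unit test.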

soxoj marked this pull request as draft June 29, 2026 13:50
soxoj changed the title from "Improve blocking pattern detection in submit mode" to "[WIP] Improve blocking pattern detection in submit mode" Jun 29, 2026
@yyq1043-cloud

Contributor Author

Hi! I'd like to request review for this PR: [WIP] Improve blocking pattern detection in submit mode
Requesting review from: @soxoj
Thank you! 🙂

yyq1043-cloud force-pushed the improve-blocking-detection branch from d2a7311 to 48ac0cb July 1, 2026 02:39
yyq1043-cloud changed the title from "[WIP] Improve blocking pattern detection in submit mode" to "fix: expand submit-mode blocking pattern detection" Jul 1, 2026
yyq1043-cloud force-pushed the improve-blocking-detection branch from 48ac0cb to 76464c1 July 1, 2026 02:51
Replace the single-if blocking check with a loop over named patterns
for better extensibility and more specific error messages including
HTTP status codes.